Antlion Optimization (ALO) is a recent population-based optimization method that has performed well in a variety of applications. The ALO algorithm mimics the way antlions hunt ants in nature. Community detection in social networks is crucial to understanding the structure of those networks. Identifying network communities can be viewed as the problem of clustering a set of nodes into communities, and k-median clustering is one of the popular techniques applied to clustering. The network clustering problem can be formalized as an optimization problem in which a quality objective function is optimized, one that captures the intuition of a cluster as a set of nodes with better internal connectivity than external connectivity. In this paper, a hybrid of antlion optimization and k-median for solving the community detection problem is proposed and named K-median Modularity ALO. Experimental results on real-life networks show that the hybrid of antlion optimization and k-median can successfully detect an optimized community structure when modularity is used as the objective function.

1. INTRODUCTION

Technology and social networks have provided significant tools for learning and continue to advance. These technologies have enabled rapid development in daily life [1]. People in social networks form a relational structure through different connections, which produces a large amount of information dissemination. This relational structure is called a community. Community detection is of great importance for discovering the structure of social networks, analyzing information, and understanding as well as controlling public sentiment. A community can be treated as a summary of the whole network and is therefore easy to visualize and understand [2].

Communities are groups of members (nodes) that are densely connected inside the group but sparsely connected to the rest of the network. Community detection in large networks is potentially very useful, because nodes belonging to a tight-knit community are more likely to have other attributes in common. For example, on the World Wide Web (WWW) a cluster can be viewed as information or as physical links and paths connecting to each other [3, 4]. Song et al. [5] applied a discrete Bat Algorithm to community detection in networks and achieved good results. Hafez et al. [6] used a Genetic Algorithm (GA) as an effective optimization technique to solve community detection as both a single-objective and a multi-objective problem; they used the most popular objectives proposed in recent years and showed how those objectives correlate with each other. Fortunato and Hric [7] offered a guided tour through the main features of the problem, pointed out strengths and weaknesses of popular methods, and gave directions for their use. Newman [8] used optimization and approximation methods, such as the spectral method, to solve the community detection problem. Newman and Girvan [9] introduced a quality measure called modularity. Pizzuti [10] proposed a genetic-algorithm-based approach to discover communities in social networks; the algorithm used a simple but effective fitness function able to identify densely connected groups of nodes with sparse connections between groups. Honghao et al. [11] suggested an ant colony optimization (ACO) based approach to discover communities. They demonstrated that the ACO-based approach yields a significant improvement in modularity values compared to existing heuristics in the literature. Masdarolomoor et al. [12] proposed a new method for community detection in networks that uses simulated annealing to maximize modularity. Their algorithm was evaluated with the modularity metric and performed better in time and accuracy than similar methods. Barawy et al. [13] presented the idea of using the results of two optimization algorithms, Particle Swarm Optimization (PSO) and Exponential Particle Swarm Optimization (EPSO), as input to the k-means clustering algorithm in order to obtain good community detection for social network data. Naem et al. [14] proposed a hybrid of cat swarm optimization and k-median for solving the community detection problem. In this paper, a hybrid of antlion optimization and k-median for solving the community detection problem is proposed and named K-median Modularity ALO. K-median clustering is used to detect the communities and ALO is used to optimize the modularity. Modularity is set as the objective function in order to obtain a high modularity value and, with it, good community detection for social network data.

The remainder of the paper is organized as follows. Section 2 briefly introduces the community detection problem, the antlion optimization algorithm, and the k-median algorithm. The details of the proposed method are presented in Section 3. Section 4 shows our experimental results on social network datasets. Finally, we give conclusions in Section 5.


2. PRELIMINARIES 

2.1. Community detection problem 

Detecting hidden communities in social networks, which is crucial to understanding the features of networks, is a very important task in Social Network Analysis. Community detection algorithms aim to find communities based on the network structure, that is, to find groups of nodes that are densely connected [15]. To evaluate clustering performance, the modularity metric is used as a measure of the quality of the communities of the network [16].

The modularity function was proposed by Newman and Girvan in 2004 [9]. It was designed to measure the strength of the division of a network into clusters (modules or communities). Modularity-based methods detect communities by searching over possible divisions of a network for one or more that have particularly high modularity.

Suppose we have a network with n vertices, and let the number of edges between vertices i and j be A_{ij}, which will usually be 0 or 1. The quantities A_{ij} are the elements of the so-called adjacency matrix.

At the same time, the expected number of edges between vertices i and j if edges are placed at random is k_i k_j / 2m, where k_i and k_j are the degrees of the vertices and m is the total number of edges in the network. The modularity can therefore be formalized as (1) [16]:

Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(H_i, H_j)    (1)

where: Q is the modularity of the network;
H_i and H_j are the identities of the communities to which nodes i and j belong in a given iteration, respectively. If vertices i and j are in the same community, \delta(H_i, H_j) = 1, else 0.
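As a rough illustration of (1), the following sketch computes Q directly from an adjacency matrix and a list of community labels. The function name `modularity`, the list-of-lists representation, and the toy graph are assumptions made for this example and are not taken from the paper.

```python
def modularity(adj, labels):
    """Modularity Q of a partition, following equation (1).

    adj    -- symmetric adjacency matrix as a list of lists (A_ij)
    labels -- labels[i] is the community H_i of node i
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]   # k_i
    two_m = sum(degrees)                  # 2m: every edge is counted twice
    if two_m == 0:
        return 0.0                        # convention chosen here for an empty graph
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:    # delta(H_i, H_j) = 1
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(adj, [0] * 6)             # all nodes in one community
```

For this toy graph the split into the two triangles gives Q = 5/14, while placing every node in a single community gives Q = 0, which matches the intuition that modularity rewards dense groups with few links between them.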

2.2. Antlion optimization algorithm

Antlion Optimization (ALO) [17] is a nature-inspired algorithm presented by Mirjalili in 2015. The artificial antlion optimization algorithm relies on the following facts and assumptions:

a. Ants move around the search space using different random walks;
b. Random walks are influenced by the traps of antlions;
c. Antlions build pits proportional to their fitness (the higher the fitness, the larger the pit);
d. Antlions with larger pits have a higher probability of catching ants;
e. Every ant can be caught by an antlion in each iteration;
f. The range of the random walks is reduced adaptively to simulate ants sliding toward antlions;
g. If an ant becomes fitter than an antlion, the ant is considered caught and pulled into the pit by the antlion;
h. After each hunt, an antlion repositions itself to the most recently caught prey and builds a pit to improve its chance of catching another prey.

ALO has shown very good results in terms of exploitation, local optima avoidance, and convergence. The mathematical model of the ALO algorithm consists of the following items [18]:

Random walks: The random walks of ants searching for food in nature can be formulated as follows:

X(t) = [0, cumsum(2r(t_1) - 1), cumsum(2r(t_2) - 1), \ldots, cumsum(2r(t_n) - 1)]    (2)

where: n is the maximum number of iterations;
cumsum computes the cumulative sum;
t denotes the step of the random walk;
r(t) is a stochastic function given by:

r(t) = \begin{cases} 1, & \text{if } random > 0.5 \\ 0, & \text{if } random \le 0.5 \end{cases}    (3)

where random is a random number in [0, 1].
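A minimal sketch of the random walk in (2) and (3) is given below. The function name `random_walk` and the explicit random generator argument are choices made for this illustration only.

```python
import random


def random_walk(n_steps, rng):
    """Cumulative-sum random walk of equation (2); returns n_steps + 1 positions."""
    walk = [0]
    for _ in range(n_steps):
        r = 1 if rng.random() > 0.5 else 0   # stochastic function r(t), equation (3)
        walk.append(walk[-1] + 2 * r - 1)    # each step moves by +1 or -1
    return walk


walk = random_walk(100, random.Random(0))    # seeded only to make the run repeatable
```

The walk always starts at 0 and changes by exactly one unit per step, so it has to be rescaled before it can be used inside the bounds of an actual search space.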
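The positions of the ants are stored in the following matrix:

M_{Ant} = \begin{bmatrix} A_{1,1} & A_{1,2} & \cdots & A_{1,d} \\ A_{2,1} & A_{2,2} & \cdots & A_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n,1} & A_{n,2} & \cdots & A_{n,d} \end{bmatrix}    (4)

where: M_{Ant} is the matrix that stores the position of every ant;
A_{i,j} is the value of the jth variable of the ith ant;
d is the number of variables;
n is the number of ants.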

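The positions of the antlions are stored in the following matrix:

M_{Antlion} = \begin{bmatrix} AL_{1,1} & AL_{1,2} & \cdots & AL_{1,d} \\ AL_{2,1} & AL_{2,2} & \cdots & AL_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ AL_{n,1} & AL_{n,2} & \cdots & AL_{n,d} \end{bmatrix}    (5)

where: M_{Antlion} is the matrix that stores the position of every antlion;
AL_{i,j} is the value of the jth variable of the ith antlion;
n is the number of antlions;
d is the number of variables.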

To keep the random walks of the ants inside the search space, they are normalized (min-max scaling) using this equation:

X_i^t = \frac{(X_i^t - a_i)(d_i^t - c_i^t)}{b_i - a_i} + c_i^t    (6)

where: a_i is the minimum of the random walk of the ith variable;
b_i is the maximum of the random walk of the ith variable;
c_i^t is the minimum of the ith variable at the tth iteration;
d_i^t is the maximum of the ith variable at the tth iteration.
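The normalization step is ordinary min-max scaling of a walk into the current bounds of a variable. The sketch below assumes that reading; the function name `normalize_walk` and the handling of a constant walk are illustrative choices that do not come from the paper.

```python
def normalize_walk(walk, lower, upper):
    """Rescale a random walk so its minimum maps to `lower` and its maximum to `upper`."""
    a, b = min(walk), max(walk)              # extremes of the raw walk
    if a == b:
        return [lower for _ in walk]         # constant walk: arbitrary choice made here
    scale = (upper - lower) / (b - a)
    return [(x - a) * scale + lower for x in walk]


# Hypothetical walk rescaled into the interval [-5, 5].
scaled = normalize_walk([0, 1, 2, 1, 0, -1, -2], -5.0, 5.0)
```

Here the raw values -2, 0, and 2 are mapped to -5, 0, and 5, so the shape of the walk is preserved while its range matches the variable's bounds.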

Trapping in pit: The trapping of ants in antlions' pits is modeled by these equations:

c_i^t = Antlion_j^t + c^t
d_i^t = Antlion_j^t + d^t    (7)

where: c^t is the minimum of all variables at the tth iteration;
d^t is the maximum of all variables at the tth iteration;
c_i^t is the minimum of all variables for the ith ant;
d_i^t is the maximum of all variables for the ith ant;
Antlion_j^t is the position of the selected jth antlion at the tth iteration.
Building trap: A roulette wheel is used to select antlions, so that fitter antlions have a higher probability of catching ants.
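Roulette-wheel selection can be sketched as below. The sketch assumes that fitness values are non-negative and that larger is better; modularity can be negative, so in practice the values would need to be shifted first, a detail the text does not discuss. The function name `roulette_select` is hypothetical.

```python
import random


def roulette_select(fitness, rng):
    """Return an index chosen with probability proportional to its fitness."""
    total = sum(fitness)
    if total <= 0:
        return rng.randrange(len(fitness))   # fallback chosen here: uniform choice
    pick = rng.random() * total              # a point on the wheel in [0, total)
    cumulative = 0.0
    for idx, f in enumerate(fitness):
        cumulative += f
        if pick < cumulative:
            return idx
    return len(fitness) - 1                  # guard against floating-point round-off


rng = random.Random(1)
only_last = [roulette_select([0.0, 0.0, 5.0], rng) for _ in range(20)]
only_first = [roulette_select([1.0, 0.0, 0.0], rng) for _ in range(20)]
```

An antlion with zero fitness is never selected while any other antlion has positive fitness, and the remaining antlions are chosen in proportion to their share of the total.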

Sliding ants towards antlion: Antlions throw sand outward from the center of the pit, so any ant trying to escape slides down into the trap. The radius of the hypersphere of the ants' random walks is reduced according to these equations:

c^t = \frac{c^t}{U}, \quad d^t = \frac{d^t}{U}    (8)

where U is a ratio calculated by the following equation:

U = 10^w \frac{t}{T}    (9)

where: t is the current iteration;
T is the maximum number of iterations;
w is a constant defined based on the current iteration (w=2 when t > 0.1T, w=3 when t > 0.5T, w=4 when t > 0.75T, w=5 when t > 0.9T, and w=6 when t > 0.95T).
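The shrinking schedule of (8) and (9) can be sketched as follows. The text does not say what happens while t is at most 0.1T; this sketch assumes no shrinking (a ratio of 1) in that phase. The function names are hypothetical.

```python
def shrink_exponent(t, T):
    """Exponent w of equation (9); None before the first threshold is reached."""
    if t > 0.95 * T:
        return 6
    if t > 0.9 * T:
        return 5
    if t > 0.75 * T:
        return 4
    if t > 0.5 * T:
        return 3
    if t > 0.1 * T:
        return 2
    return None


def shrink_ratio(t, T):
    """Ratio U of equation (9); assumed to be 1 while t <= 0.1 T."""
    w = shrink_exponent(t, T)
    if w is None:
        return 1.0
    return (10 ** w) * t / T


def shrink_bounds(c, d, t, T):
    """Equation (8): divide the lower and upper bounds by the ratio U."""
    u = shrink_ratio(t, T)
    return c / u, d / u


bounds_early = shrink_bounds(-10.0, 10.0, 5, 100)    # no shrinking yet
bounds_mid = shrink_bounds(-10.0, 10.0, 20, 100)     # U = 10^2 * 20 / 100 = 20
```

Because w grows in steps as t approaches T, the bounds contract sharply late in the run, which moves the search from exploration toward exploitation.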

Catching ant and re-building pit: The final step of the hunt occurs when the ant reaches the bottom of the pit and is captured. The antlion then updates its position to the last position of the caught ant according to this equation:

Antlion_j^t = Ant_i^t \quad \text{if } f(Ant_i^t) > f(Antlion_j^t)    (10)

where: Antlion_j^t is the position of the selected jth antlion at the tth iteration;
Ant_i^t is the position of the ith ant at the tth iteration;
t is the current iteration.

Elitism: Elitism is very important in evolutionary algorithms because it preserves the best solution found so far. It can be modeled by the following equation:

Ant_i^t = \frac{R_A^t + R_E^t}{2}    (11)

where: R_A^t is the random walk around the antlion selected by the roulette wheel at the tth iteration;
R_E^t is the random walk around the elite antlion at the tth iteration.
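The two update rules, catching prey in (10) and elitism in (11), are simple enough to sketch together. The function names and the list-based position vectors are assumptions made for this illustration, and fitness is treated as a quantity to maximize.

```python
def elitism_position(walk_antlion, walk_elite):
    """Equation (11): average the walk around the selected antlion and the walk around the elite."""
    return [(a + e) / 2 for a, e in zip(walk_antlion, walk_elite)]


def catch_prey(antlion_pos, antlion_fit, ant_pos, ant_fit):
    """Equation (10): the antlion takes the ant's position if the ant is fitter."""
    if ant_fit > antlion_fit:
        return list(ant_pos), ant_fit
    return list(antlion_pos), antlion_fit


new_ant = elitism_position([1.0, 3.0], [3.0, 5.0])
replaced = catch_prey([0.0], 0.2, [1.0], 0.5)    # ant is fitter: antlion moves
kept = catch_prey([0.0], 0.6, [1.0], 0.5)        # antlion is fitter: no change
```

Averaging the two walks pulls every ant partly toward the best solution found so far while still letting the roulette-selected antlion influence its position.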


2.3. K-medians clustering algorithm 

Clustering is a standard procedure in multivariate data analysis that is used for communication and is designed to explore the community structure of data objects. Clustering has several methods, such as k-means and k-medians [19]. The k-medians clustering algorithm is similar to the k-means clustering algorithm, but k-medians updates each cluster center by taking the median of the cluster's members as the new center. K-medians is sensitive to the initial positions of its k centers, since every center tends to remain roughly in the cluster in which it is first placed [20]. The distance between each data point and a cluster center is evaluated using this equation:


d(x, c) = \|x - c\|    (12)

where x is a data point and c is a cluster center. The k-medians algorithm attempts to form k disjoint clusters that minimize the following equation:

U = \sum_{i=1}^{k} \sum_{x \in D_i} \|x - c_i\|    (13)

where: x is a member of the data D, and D_i is the set of members assigned to cluster i;
c_i is the center of cluster i;
k is the number of clusters.
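One pass of k-medians (assignment, median update, and the objective in (13)) might look like the sketch below. The text does not specify the norm; this sketch assumes the Manhattan (L1) distance, which pairs naturally with coordinate-wise medians. All function names and the sample points are hypothetical.

```python
from statistics import median


def manhattan(x, c):
    """L1 distance between a point and a center (assumed norm for equation (12))."""
    return sum(abs(a - b) for a, b in zip(x, c))


def assign_clusters(points, centers):
    """Label each point with the index of its closest center."""
    labels = []
    for x in points:
        dists = [manhattan(x, c) for c in centers]
        labels.append(dists.index(min(dists)))   # ties go to the lowest index
    return labels


def update_centers(points, labels, centers):
    """Move each center to the coordinate-wise median of its members."""
    new_centers = []
    for idx, old in enumerate(centers):
        members = [x for x, lab in zip(points, labels) if lab == idx]
        if not members:
            new_centers.append(list(old))        # empty cluster: keep the old center
        else:
            new_centers.append([median(col) for col in zip(*members)])
    return new_centers


def objective(points, labels, centers):
    """Equation (13): total distance of the points to their assigned centers."""
    return sum(manhattan(x, centers[lab]) for x, lab in zip(points, labels))


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers = [[0, 0], [10, 10]]
labels = assign_clusters(points, centers)
new_centers = update_centers(points, labels, centers)
cost = objective(points, labels, new_centers)
```

In this example the two groups of points are well separated, so a single pass already leaves the centers unchanged.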
3. THE PROPOSED METHOD

The proposed method (K-median Modularity ALO) consists of two main parts: clustering the data with k-median, and searching for the best modularity, and hence the best community detection for the social network dataset, by applying the ALO algorithm. The steps of the proposed method are summarized below:

Step 1: Defining the initial cluster centers

Randomly select k points from the data points to be the initial cluster centers.

Step 2: Grouping data into clusters

Place each data point into the cluster with the closest cluster center, which minimizes (13).

Step 3: Calculating the modularity

Modularity is the fitness function in this method. The value of modularity is calculated using (1).

Step 4: Optimization with ALO

Apply the ALO algorithm to the cluster centers to obtain the best modularity value and the best cluster centers for the data.

Step 5: Repeat steps

Repeat steps 2-4 until the stopping criterion is reached, updating the cluster centers after each pass by calculating the median of each cluster. The steps of the proposed method are summarized in Figure 1.


[Figure 1 flowchart: Apply clustering process to detect the communities -> Calculate communities' modularity -> Apply ALO algorithm to optimize the modularity]


Figure 1. Proposed model 
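To make the link between the clustering step and the fitness function concrete, the sketch below evaluates one candidate set of cluster centers (the kind of position an ant or antlion would hold) by assigning nodes to their closest center and returning the modularity of the resulting partition. It assumes that each node is represented by its row of the adjacency matrix and that distance is Manhattan distance; the paper does not state either detail, and all names are hypothetical.

```python
def modularity(adj, labels):
    """Modularity Q of equation (1) for a labeled partition."""
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    if two_m == 0:
        return 0.0
    q = 0.0
    for i in range(len(adj)):
        for j in range(len(adj)):
            if labels[i] == labels[j]:
                q += adj[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m


def assign_nodes(adj, centers):
    """Step 2: give each node (an adjacency row) the label of its closest center."""
    labels = []
    for row in adj:
        dists = [sum(abs(a - b) for a, b in zip(row, c)) for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


def fitness(adj, centers):
    """Step 3: modularity of the partition induced by the candidate centers."""
    return modularity(adj, assign_nodes(adj, centers))


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

candidate = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]   # one center per triangle
node_labels = assign_nodes(adj, candidate)
score = fitness(adj, candidate)
```

An optimizer such as ALO would then propose new candidate centers and keep those with a higher score; that search loop is deliberately left out of this sketch.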


4. EXPERIMENTAL RESULTS 
4.1. Datasets 

In this section, the proposed method (K-median Modularity ALO) is applied to four real-life social network datasets:

The Zachary Karate Club:

This network was studied by Wayne W. Zachary from 1970 to 1972 and was built from observations of the members of a university karate club. The Karate network graph consists of 34 nodes and 78 edges, where each member of the club is represented by a node and each relation between two members is represented by an edge. This dataset is often used to test whether a method can find the groups of people that formed after a conflict arose between teachers, which led the karate club to split into two groups [21].

The Bottlenose Dolphins network:

This is the social network of bottlenose dolphins compiled by Lusseau from 1994 to 2001. The network was built on the basis of the behavior of 62 bottlenose dolphins living in Doubtful Sound, New Zealand. The nodes represent the dolphins and the edges represent frequent associations [22].

American College football network:

This network represents American football games between American colleges during the regular season in fall 2000, as compiled by M. Girvan and M. Newman. Each node in the network represents a team, and each edge represents a regular-season game between the two teams it connects [23].

The Polbooks network:

This is a network of books about US politics published around the time of the 2004 presidential election and sold by the online bookseller Amazon.com. Edges between books represent frequent co-purchasing of books by the same buyers. This network was compiled by V. Krebs [24].

Table 1 contains information for the previously introduced datasets.
Table 1. Comparison of modularity results

Optimization method    Karate    Dolphins    Football    Polbooks
GN [25]                0.401     0.519       0.599       0.516
FN [25]                0.380     0.489       0.577       0.502
BGLL [25]              0.418     0.518       0.602       0.498
HSCDA [25]             0.419     0.527       0.602       0.527
PSO K-means [14]       0.433     0.445       0.529       0.465
PSO K-median [14]      0.442     0.461       0.566       0.480
BAT K-means [14]       0.449     0.501       0.583       0.50
BAT K-median [14]      0.470     0.472       0.614       0.51
CSO K-means [14]       0.451     0.472       0.593       0.53
CSO K-median [14]      0.489     0.502       0.621       0.559
ALO K-means            0.469     0.519       0.612       0.557
ALO K-median           0.501     0.525       0.628       0.563


4.2. Results and analysis 

The modularity results obtained with the proposed method (K-median Modularity ALO) are compared to those of K-means Modularity PSO, K-means Modularity Bat optimization, K-means Modularity CSO, K-median Modularity PSO, K-median Modularity Bat optimization, K-median Modularity CSO, GN, FN, BGLL, HSCDA [25], and K-means Modularity ALO on the real datasets. Modularity is a well-known community quality measure that is very widely used in community detection, and it is used here as the quality measure for the community structures produced by all the other approaches. In addition, Normalized Mutual Information (NMI) is used to compare the accuracy of the resulting communities; it computes the similarity between two partitions. NMI is given by the following equation [25]:


NMI(X, Y) = \frac{-2 \sum_{i=1}^{C_X} \sum_{j=1}^{C_Y} C_{ij} \log\left(\frac{C_{ij} N}{C_i C_j}\right)}{\sum_{i=1}^{C_X} C_i \log\left(\frac{C_i}{N}\right) + \sum_{j=1}^{C_Y} C_j \log\left(\frac{C_j}{N}\right)}

where: NMI(X, Y) denotes the NMI of two partitions X and Y;
C_{ij} is the number of nodes assigned to the ith community in partition X and the jth community in partition Y;
C_i is the number of nodes in partition X assigned to the ith community;
C_j is the number of nodes in partition Y assigned to the jth community;
C_X is the number of communities in partition X;
C_Y is the number of communities in partition Y;
N is the total number of nodes.

If NMI equals 1, the two partitions are completely consistent. If NMI equals 0, the two partitions are completely inconsistent.
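The NMI formula can be checked on small partitions with a sketch like the one below. The function name `nmi` is hypothetical, pairs with C_ij = 0 are skipped (the usual convention that 0 log 0 = 0), and returning 1 when both partitions consist of a single community is a choice made here, since the formula is undefined in that case.

```python
from collections import Counter
from math import log


def nmi(labels_x, labels_y):
    """Normalized Mutual Information between two partitions of the same nodes."""
    n = len(labels_x)
    count_x = Counter(labels_x)                  # C_i: community sizes in X
    count_y = Counter(labels_y)                  # C_j: community sizes in Y
    count_xy = Counter(zip(labels_x, labels_y))  # C_ij: only non-zero overlaps

    numerator = 0.0
    for (cx, cy), c_ij in count_xy.items():
        numerator += c_ij * log(c_ij * n / (count_x[cx] * count_y[cy]))
    numerator *= -2.0

    denominator = sum(c * log(c / n) for c in count_x.values())
    denominator += sum(c * log(c / n) for c in count_y.values())
    if denominator == 0:
        return 1.0                               # both partitions are one community
    return numerator / denominator


same = nmi([0, 0, 1, 1], [1, 1, 0, 0])           # same grouping, different label names
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])      # groupings share no information
```

The first pair yields 1 because NMI ignores how the communities are named, and the second yields 0 because knowing a node's community in one partition says nothing about its community in the other.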

The proposed algorithm is applied to the previously introduced datasets. The results are obtained from 50 consecutive runs on each dataset, over which the average modularity and NMI are calculated. The modularity results are given in Table 1, from which it can be concluded that the modularity obtained by K-median Modularity ALO is better than that obtained by the other methods when compared to the results previously reported in [14] and [25], except for the Dolphins network, where HSCDA obtains a better result. Table 2 presents the experimental results for NMI. It confirms that the proposed K-median Modularity ALO outperforms the other methods applied in [14] and [25] as well as K-means Modularity ALO.


Table 2. Comparison of NMI results

Data        PSO k-median    BAT k-median    CSO k-median    ALO k-median
Karate      0.61            0.68            0.71            0.84
Dolphins    0.50            0.53            0.61            0.68
Football    0.79            0.82            0.86            0.89
Polbooks    0.52            0.56            0.59            0.67


Figures 2-5 show the relation between modularity and community number for the Karate network, Dolphins network, Polbooks network, and Football network, respectively. When the community number equals 4, the Karate network and the Dolphins network reach their maximum modularity. When the community number equals 9, the Football network reaches its maximum modularity, and when the community number equals 3, the Polbooks network reaches its maximum modularity.



[Plots of modularity Q (vertical axis) against community number (horizontal axis); series: Q CSO, Q BAT, Q PSO, Q ALO]

Figure 2. Relation between modularity Q and community number for Karate network

Figure 3. Relation between modularity Q and community number for Dolphins network

Figure 4. Relation between modularity Q and community number for Polbooks network

Figure 5. Relation between modularity Q and community number for Football network


5. CONCLUSION 

Community detection is of great importance in computer science, biology, physics, and sociology for understanding complex systems. The problem is very challenging and has not yet been satisfactorily solved, although many methods have been proposed. The K-median Modularity ALO method was successfully implemented, and the NMI measure on the networks confirmed this result when compared to other optimization methods.